Brushed up fairqueuing package #85259

Merged
merged 1 commit into from Nov 15, 2019


@MikeSpreitzer MikeSpreitzer commented Nov 14, 2019

What type of PR is this?

/kind cleanup

What this PR does / why we need it:
This PR makes minor improvements to the fairqueuing package that was just introduced by #85192. There are comments on that PR that were not addressed by the time it merged. This PR addresses those comments, except for the one bug that is fixed in #85257.

Which issue(s) this PR fixes:

Special notes for your reviewer:
This improves fresh code that is not yet used (it is behind an alpha FeatureGate that has not yet had any functionality in a release).

Does this PR introduce a user-facing change?:

NONE

(there will be a release note, in a future PR)

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

[KEP] https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20190228-priority-and-fairness.md

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 14, 2019
@MikeSpreitzer
Member Author

/cc @lavalamp
/cc @yue9944882
/cc @deads2k
/cc @aaron-prindle
/cc @mars1024
/cc @yliaog
/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 14, 2019
@MikeSpreitzer
Member Author

Oh crud, forgot to make update again... stand by...

@MikeSpreitzer
Member Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 14, 2019
@MikeSpreitzer
Member Author

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 14, 2019
@MikeSpreitzer
Member Author

My latest force-push incorporated the make update.
Now, how do I remove my hold?

@MikeSpreitzer
Member Author

/help

@liggitt
Member

liggitt commented Nov 14, 2019

see https://prow.k8s.io/command-help

/hold cancel

@MikeSpreitzer
Member Author

/honk
in honor of House House

@k8s-ci-robot
Contributor

@MikeSpreitzer:
goose image

In response to this:

/honk
in honor of House House


@MikeSpreitzer
Member Author

Now that #84771 has merged, I am going to extend this PR to handle the case of zero queues.
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 14, 2019
@MikeSpreitzer
Member Author

/retest

@MikeSpreitzer
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 14, 2019
@MikeSpreitzer
Member Author

/retest

@fedebongio
Contributor

/assign @lavalamp

Member

@lavalamp lavalamp left a comment

This is a significant improvement, thanks-- I think we still need more testing (e.g. changing settings of an in-operation queueset seems untested?)

I'm not going to block on waiting for improvements since this is a large improvement over what is already in.

/lgtm
/approve

@@ -186,14 +209,26 @@ const (
 // irrelevant.
 func (qs *queueSet) Wait(ctx context.Context, hashValue uint64, descr1, descr2 interface{}) (tryAnother, execute bool, afterExecution func()) {
 	var req *request
-	decision := func() string {
+	decision := func() decision {
Member

nit: this variable name makes the switch statements below confusing, can we change it to "finalDecision" or something like that?

Member Author

OK, I will give the type a more descriptive name.
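
The hunk above changes the closure's return type from `string` to a dedicated type. A minimal sketch of that pattern follows. The names `waitOutcome`, `outcomeExecute`, `outcomeReject`, `outcomeCancel` and `decide` are hypothetical, chosen for illustration only; the names actually used in the PR are not shown in this conversation.

```go
package main

import "fmt"

// waitOutcome is a hypothetical name for the decision type; the PR's
// real type name and values are not visible in this conversation.
type waitOutcome int

const (
	outcomeExecute waitOutcome = iota // run the request now
	outcomeReject                     // refuse the request
	outcomeCancel                     // the caller gave up while waiting
)

// String makes the outcome readable in log output.
func (o waitOutcome) String() string {
	switch o {
	case outcomeExecute:
		return "execute"
	case outcomeReject:
		return "reject"
	case outcomeCancel:
		return "cancel"
	default:
		return fmt.Sprintf("waitOutcome(%d)", int(o))
	}
}

// decide stands in for the closure inside Wait; the rule is invented.
func decide(seatsFree int, cancelled bool) waitOutcome {
	switch {
	case cancelled:
		return outcomeCancel
	case seatsFree > 0:
		return outcomeExecute
	default:
		return outcomeReject
	}
}

func main() {
	// A variable name distinct from the type name keeps the switch readable.
	finalDecision := decide(1, false)
	switch finalDecision {
	case outcomeExecute:
		fmt.Println("decision:", finalDecision)
	default:
		fmt.Println("not executing:", finalDecision)
	}
}
```

With a typed constant set, a misspelled case no longer compiles, which a `string` return value cannot guarantee.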

@@ -186,14 +209,26 @@ const (
// irrelevant.
func (qs *queueSet) Wait(ctx context.Context, hashValue uint64, descr1, descr2 interface{}) (tryAnother, execute bool, afterExecution func()) {
Member

I am not sure how I missed that descr1, descr2 interface{} is being passed around; that is really weird and hard to understand, and neither the comment nor the name of the parameters helps me understand what they are supposed to be such that passing them as an interface{} makes sense. Can we fix this in a followup? Worst case, the caller can package their log identity in a single string. I think it is reasonable for the caller to pass some printable request identifier. There may also already be such a thing hidden in the context.

Member

Note that we already have a trace system that prints out times when requests are slow, it uses a number to identify the request so you can see which trace line goes to which request. I think we should probably reuse that number. I think it is buried in ctx somewhere.

Member Author

I am not sure I understand the idea regarding the trace system. If the logging output gets just one message about a request, and that message shows a number and not the RequestInfo and User details, then it will not be helpful. Is the number only for the source code, and somehow the details appear in the log message? Or does something trigger an additional log message that associates the number with details?

I really do not want to go creating long strings that will not get logged for every request.

Member Author

BTW, what is weird and hard to understand about this? It is passing two opaque values that get logged to explain what the request is.

The goal is not to log a request identifier but rather enough details of the request that a reader of log messages can understand which request is involved.
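
The concern about not building long strings for every request can be illustrated with a small sketch. It assumes the opaque values implement `fmt.Stringer` and that the logger formats its arguments only when the verbosity level is enabled. `lazyDescr`, `logAt`, `verbosity` and `formatCalls` are hypothetical names, not part of the fairqueuing package.

```go
package main

import (
	"fmt"
	"strings"
)

// verbosity stands in for the logger's -v setting (assumption).
var verbosity = 2

// formatCalls counts how often the descriptive string is built.
var formatCalls int

// lazyDescr is a hypothetical wrapper: it holds request data and only
// builds the description if something actually prints it.
type lazyDescr struct {
	verb, path, user string
}

func (d *lazyDescr) String() string {
	formatCalls++
	return strings.Join([]string{d.verb, d.path, "user=" + d.user}, " ")
}

// logAt formats its arguments only when the level is enabled, so a
// disabled level never calls String on the opaque values.
func logAt(level int, format string, args ...interface{}) string {
	if level > verbosity {
		return ""
	}
	return fmt.Sprintf(format, args...)
}

func main() {
	d := &lazyDescr{verb: "GET", path: "/api/v1/pods", user: "alice"}
	logAt(6, "queued request %v", d) // level disabled: nothing is formatted
	fmt.Println("String calls after V(6):", formatCalls)
	fmt.Println(logAt(2, "rejected request %v", d))
	fmt.Println("String calls after V(2):", formatCalls)
}
```

Passing the values as `interface{}` defers the formatting cost to the moment a line is actually emitted.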

Member

Since there's (potentially) multiple messages logged about a single request, they'll be interspersed in the logs with gajillions of unrelated messages. So, a reader of the logs needs a simple way to show all messages related to the request of interest to them.

We already have a similar problem when showing which aspects of a request were slow, in the trace system. There, we print in one message the request details, and in many messages timing information. To keep the log spam down, traces only begin to print once some part of the request has been sufficiently slow.

Now, we have the additional problem of making sure that the relevant additional request information gets logged when appropriate. It actually makes the log lines harder to read and searches less precise to print out the request description with every related log line rather than printing out a request ID and the relevant information once.
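
The "request ID on every line, details once" idea can be sketched as follows. This is an illustration under stated assumptions, not the apiserver's trace system: `requestLog` and its methods are hypothetical, and lines are collected in a slice rather than written to a real logger.

```go
package main

import (
	"fmt"
	"sync"
)

// requestLog is a hypothetical helper: it emits the full request
// description once per ID and only the ID on later lines.
type requestLog struct {
	mu        sync.Mutex
	described map[uint64]bool
	lines     []string // collected instead of printed, for easy inspection
}

func newRequestLog() *requestLog {
	return &requestLog{described: make(map[uint64]bool)}
}

// Log records msg tagged with id. describe is called only the first
// time an id is seen, so its cost is paid at most once per request.
func (l *requestLog) Log(id uint64, describe func() string, msg string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if !l.described[id] {
		l.described[id] = true
		l.lines = append(l.lines, fmt.Sprintf("req=%d details: %s", id, describe()))
	}
	l.lines = append(l.lines, fmt.Sprintf("req=%d %s", id, msg))
}

// Done forgets id so the map does not grow without bound.
func (l *requestLog) Done(id uint64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.described, id)
}

func main() {
	l := newRequestLog()
	describe := func() string { return "GET /api/v1/pods user=alice" }
	l.Log(42, describe, "enqueued")
	l.Log(42, describe, "dispatched")
	l.Done(42)
	for _, line := range l.lines {
		fmt.Println(line)
	}
}
```

A reader can then filter the log on `req=42` to see every line for one request.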

(I say all this as someone who occasionally does high-stakes spelunking in apiserver logs.)

I complain about the variable names since it seems really strange to know that you need two pieces of data, but not know what they are enough to give them a name or a type. It makes this code difficult to understand apart from its usage.

I actually think logging at V(6) isn't going to be super helpful in real life; V(6) is an impractical amount of data. I'd like to find a way to log < 1 message per request, so that we can have the logs on at V(2), to give operators some chance at retroactively understanding what the system was doing. Even something rudimentary like occasionally logging the trace IDs of requests that are stuck in hot queues would be useful.
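
One way to get fewer than one message per request is time-based sampling. The sketch below is only an illustration: `sampledLogger`, `newSampledLogger`, `Observe` and the `at` helper are hypothetical, the clock is passed in to keep the example deterministic, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// sampledLogger is a hypothetical helper that emits at most one line
// per interval and reports how many events it skipped in between.
// It is not safe for concurrent use.
type sampledLogger struct {
	interval time.Duration
	last     time.Time
	skipped  int
}

func newSampledLogger(intervalMs int) *sampledLogger {
	return &sampledLogger{interval: time.Duration(intervalMs) * time.Millisecond}
}

// Observe is called once per event. It returns the line to log, or ""
// when the event falls inside the current interval and is suppressed.
func (s *sampledLogger) Observe(now time.Time, traceID uint64, queue int) string {
	if !s.last.IsZero() && now.Sub(s.last) < s.interval {
		s.skipped++
		return ""
	}
	line := fmt.Sprintf("request %d stuck in queue %d (%d suppressed since last report)",
		traceID, queue, s.skipped)
	s.last = now
	s.skipped = 0
	return line
}

// start and at give the example a fixed, fake clock.
var start = time.Date(2019, time.November, 14, 0, 0, 0, 0, time.UTC)

func at(ms int) time.Time { return start.Add(time.Duration(ms) * time.Millisecond) }

func main() {
	s := newSampledLogger(1000)
	for i, ms := range []int{0, 500, 1500} {
		if line := s.Observe(at(ms), uint64(i+1), 7); line != "" {
			fmt.Println(line)
		}
	}
}
```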

We also probably want to make requests that spend time in the queue automatically trigger the trace system; maybe instead we can adjust the top-level per-request line (which is V(3)) to also log the amount of time spent in queues.

Here's an example use of the trace code:

trace := utiltrace.New("Update", utiltrace.Field{Key: "url", Value: req.URL.Path}, utiltrace.Field{Key: "user-agent", Value: &lazyTruncatedUserAgent{req}}, utiltrace.Field{Key: "client", Value: &lazyClientIP{req}})

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2019
@lavalamp
Member

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, MikeSpreitzer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 14, 2019
This commit responds to the comments on PR kubernetes#85192 that were not yet
addressed at the time it merged, apart from the one fixed in PR

Generalized fairqueuing to allow for zero queues, to support a
priority level that limits concurrency but does no queuing.
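
As a rough illustration of the zero-queue case, a limiter that caps concurrency and rejects instead of queuing might look like the sketch below. The `limiter` type, `newLimiter` and `TryStart` are hypothetical and much simpler than the queue set in this PR; only the shape of the return values (`execute`, `afterExecution`) echoes the `Wait` signature quoted in the review.

```go
package main

import (
	"fmt"
	"sync"
)

// limiter is a hypothetical stand-in for a queue set configured with
// zero queues: it caps concurrency and rejects instead of queuing.
type limiter struct {
	mu        sync.Mutex
	limit     int
	executing int
}

func newLimiter(limit int) *limiter { return &limiter{limit: limit} }

// TryStart reports whether the request may execute now. When it may,
// afterExecution must be called once the request finishes; calling it
// more than once has no further effect.
func (l *limiter) TryStart() (execute bool, afterExecution func()) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.executing >= l.limit {
		return false, func() {} // no queue to wait in: reject immediately
	}
	l.executing++
	var once sync.Once
	return true, func() {
		once.Do(func() {
			l.mu.Lock()
			l.executing--
			l.mu.Unlock()
		})
	}
}

func main() {
	l := newLimiter(2)
	ok1, done1 := l.TryStart()
	ok2, done2 := l.TryStart()
	ok3, _ := l.TryStart()
	fmt.Println(ok1, ok2, ok3) // the third request exceeds the limit

	done1() // one seat is freed
	ok4, done4 := l.TryStart()
	fmt.Println(ok4)
	done2()
	done4()
}
```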
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2019
@MikeSpreitzer
Member Author

I force-pushed to rebase on master (which was required due to conflict with another recent change) and respond to the review comments.

@yue9944882
Member

/test pull-kubernetes-integration
/lgtm

relgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2019
@k8s-ci-robot k8s-ci-robot merged commit 83c1d70 into kubernetes:master Nov 15, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.17 milestone Nov 15, 2019
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants